{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "# Multivariate Dependencies Beyond Shannon Information\n",
    "\n",
    "This is a companion Jupyter notebook to the work *Multivariate Dependencies Beyond Shannon Information* by Ryan G. James and James P. Crutchfield. This worksheet was written by Ryan G. James. It primarily makes use of the ``dit`` package for information theory calculations."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Basic Imports\n",
    "\n",
    "We first import basic functionality. Further functionality will be imported as needed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "hide_input": false,
    "init_cell": true
   },
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline\n",
    "\n",
    "from dit import ditParams, Distribution\n",
    "from dit.distconst import uniform\n",
    "\n",
    "ditParams['repr.print'] = ditParams['print.exact'] = True"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Distributions\n",
    "\n",
    "Here we define the two distributions to be compared."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from dit.example_dists.mdbsi import dyadic, triadic\n",
    "\n",
    "dists = [('dyadic', dyadic), ('triadic', triadic)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## I-Diagrams and X-Diagrams\n",
    "\n",
    "Here we construct the I- and X-Diagrams of both distributions. The I-Diagram is constructed by considering how the entropies of each variable interact. The X-Diagram is similar, but considers how the extropies of each variable interact."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "-"
    }
   },
   "outputs": [],
   "source": [
    "from dit.profiles import ExtropyPartition, ShannonPartition\n",
    "\n",
    "def print_partition(dists, partition):\n",
    "    \"\"\"Print the given partition of each distribution side by side.\"\"\"\n",
    "    ps = [str(partition(dist)).split('\\n') for _, dist in dists]\n",
    "    print('\\t' + '\\t\\t\\t\\t'.join(name for name, _ in dists))\n",
    "    for lines in zip(*ps):\n",
    "        print('\\t\\t'.join(lines))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "print_partition(dists, ShannonPartition)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Both I-Diagrams are the same. This implies that *no* Shannon measure (entropy, mutual information, conditional mutual information [including the transfer entropy], co-information, etc.) can differentiate these patterns of dependency."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "print_partition(dists, ExtropyPartition)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Similarly, the X-Diagrams are identical and so no extropy-based measure can differentiate the distributions."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Measures of Mutual and Common Information\n",
    "\n",
    "We now compute several measures of mutual and common information:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "try:\n",
    "    from prettytable import PrettyTable\n",
    "except ImportError:\n",
    "    from pltable import PrettyTable\n",
    "\n",
    "from dit.multivariate import (entropy,\n",
    "                              coinformation,\n",
    "                              total_correlation,\n",
    "                              dual_total_correlation,\n",
    "                              independent_information,\n",
    "                              caekl_mutual_information,\n",
    "                              interaction_information,\n",
    "                              intrinsic_total_correlation,\n",
    "                              gk_common_information,\n",
    "                              wyner_common_information,\n",
    "                              exact_common_information,\n",
    "                              functional_common_information,\n",
    "                              mss_common_information,\n",
    "                              tse_complexity,\n",
    "                             )\n",
    "\n",
    "from dit.other import (extropy,\n",
    "                       disequilibrium,\n",
    "                       perplexity,\n",
    "                       LMPR_complexity,\n",
    "                       renyi_entropy,\n",
    "                       tsallis_entropy,\n",
    "                      )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "def print_table(title, table, dists):\n",
    "    \"\"\"Print a table with one row per distribution and one column per measure.\"\"\"\n",
    "    pt = PrettyTable(field_names=[''] + [name for name, _ in table])\n",
    "    for name, _ in table:\n",
    "        pt.float_format[name] = ' 5.3'\n",
    "    for name, dist in dists:\n",
    "        pt.add_row([name] + [measure(dist) for _, measure in table])\n",
    "\n",
    "    print(\"\\n{}\".format(title))\n",
    "    print(pt.get_string())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Entropies\n",
    "\n",
    "Entropies generally capture the uncertainty contained in a distribution. Here, we compute the Shannon entropy, the Rényi entropy of order 2 (also known as the collision entropy), and the Tsallis entropy of order 2. Though we only compute the order 2 values, any order will produce identical values for both distributions."
   ]
  },
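  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a worked check, assume the standard definitions of these entropies (the exact conventions used by ``dit`` are not restated here) and that both distributions are uniform over 8 outcomes, so that $p(\\mathbf{x}) = 1/8$ for every outcome. Then:\n",
    "$$\n",
    " H = -\\sum p(\\mathbf{x}) \\log_2 p(\\mathbf{x}) = \\log_2 8 = 3, \\qquad\n",
    " H_2 = \\frac{1}{1-2} \\log_2 \\sum p(\\mathbf{x})^2 = -\\log_2 \\frac{1}{8} = 3, \\qquad\n",
    " S_2 = \\frac{1}{2-1} \\left( 1 - \\sum p(\\mathbf{x})^2 \\right) = \\frac{7}{8}\n",
    "$$\n",
    "\n",
    "Each expression depends only on the list of probabilities, not on which outcomes carry them, so the same argument applies at any order."
   ]
  },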
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "entropies = [('H', entropy),\n",
    "             ('Renyi (α=2)', lambda d: renyi_entropy(d, 2)),\n",
    "             ('Tsallis (q=2)', lambda d: tsallis_entropy(d, 2)),\n",
    "            ]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "print_table('Entropies', entropies, dists)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "The entropies for both distributions are identical. This is not surprising: they have the same probability mass function."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Mutual Informations\n",
    "\n",
    "Mutual informations are multivariate generalizations of the standard Shannon mutual information. By far, the most widely used (and often simply assumed to be the only) generalization is the total correlation, sometimes called the multi-information. It is defined as:\n",
    "$$\n",
    " T[\\mathbf{X}] = \\sum H[X_i] - H[\\mathbf{X}] = \\sum p(\\mathbf{x}) \\log_2 \\frac{p(\\mathbf{x})}{p(x_1)p(x_2)\\ldots p(x_n)}\n",
    "$$\n",
    "\n",
    "Other generalizations exist, though, including the co-information, the dual total correlation, and the CAEKL mutual information."
   ]
  },
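  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For reference, here is a sketch of how two of the other generalizations can be written for three variables. These are the standard forms; the notation $\\mathbf{X}_{\\setminus i}$, meaning all variables except $X_i$, is an assumption introduced here:\n",
    "$$\n",
    " B[\\mathbf{X}] = H[\\mathbf{X}] - \\sum_i H[X_i | \\mathbf{X}_{\\setminus i}], \\qquad\n",
    " I[X_0 : X_1 : X_2] = I[X_0 : X_1] - I[X_0 : X_1 | X_2]\n",
    "$$\n",
    "\n",
    "If, as the I-diagrams computed above should report, each distribution has $H[X_i] = 2$, $H[\\mathbf{X}] = 3$, $H[X_i | \\mathbf{X}_{\\setminus i}] = 0$, and $I[X_0 : X_1] = I[X_0 : X_1 | X_2] = 1$ (all in bits), then:\n",
    "$$\n",
    " T[\\mathbf{X}] = 3 \\cdot 2 - 3 = 3, \\qquad\n",
    " B[\\mathbf{X}] = 3 - 0 = 3, \\qquad\n",
    " I[X_0 : X_1 : X_2] = 1 - 1 = 0\n",
    "$$"
   ]
  },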
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "mutual_informations = [('I', coinformation),\n",
    "                       ('T', total_correlation),\n",
    "                       ('B', dual_total_correlation),\n",
    "                       ('J', caekl_mutual_information),\n",
    "                       ('II', interaction_information),\n",
    "                      ]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "print_table('Mutual Informations', mutual_informations, dists)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "That none of these generalizations distinguishes the two distributions is not surprising: each of them can be defined as a function of the I-diagram, and so must take identical values here."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Common Informations\n",
    "\n",
    "Common informations are generally defined using an auxiliary random variable which captures some amount of information shared by the variables of interest. For all but the Gács-Körner common information, that shared information is the dual total correlation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "common_informations = [('K', gk_common_information),\n",
    "                       ('C', lambda d: wyner_common_information(d, niter=1, polish=False)),\n",
    "                       ('G', lambda d: exact_common_information(d, niter=1, polish=False)),\n",
    "                       ('F', functional_common_information),\n",
    "                       ('M', mss_common_information),\n",
    "                      ]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "print_table('Common Informations', common_informations, dists)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "As it turns out, only the Gács-Körner common information, `K`, distinguishes the two."
   ]
  },
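  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "To see why `K` can behave differently, one standard way to write the Gács-Körner common information (the notation is an assumption here, not taken from ``dit``) is as the largest entropy of a variable $V$ that can be computed from each variable alone:\n",
    "$$\n",
    " K[X_0 : X_1 : X_2] = \\max_{V = f_0(X_0) = f_1(X_1) = f_2(X_2)} H[V]\n",
    "$$\n",
    "\n",
    "Under this definition, a nonzero value requires some piece of information to be recoverable from every variable individually. That is a property of the structure of the joint distribution, not of the I-diagram atoms alone, which is how `K` can differ even though the I-diagrams agree."
   ]
  },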
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Other Measures\n",
    "\n",
    "Here we list a variety of other information measures."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "other_measures = [('IMI', lambda d: intrinsic_total_correlation(d, d.rvs[:-1], d.rvs[-1])),\n",
    "                  ('X', extropy),\n",
    "                  ('R', independent_information),\n",
    "                  ('P', perplexity),\n",
    "                  ('D', disequilibrium),\n",
    "                  ('LMPR', LMPR_complexity),\n",
    "                  ('TSE', tse_complexity),\n",
    "                 ]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "outputs": [],
   "source": [
    "print_table('Other Measures', other_measures, dists)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "-"
    }
   },
   "source": [
    "Several other measures fail to differentiate our two distributions. For many of these (`X`, `P`, `D`, `LMPR`) this is because they are defined relative to the probability mass function. For the others, it is due to the equality of the I-diagrams. Only the intrinsic mutual information, `IMI`, can distinguish the two."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Information Profiles\n",
    "\n",
    "Next, we consider several \"profiles\" of the information."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "fragment"
    }
   },
   "outputs": [],
   "source": [
    "from dit.profiles import (ComplexityProfile, MUIProfile, SchneidmanProfile,\n",
    "                          EntropyTriangle, EntropyTriangle2)\n",
    "\n",
    "def plot_profile(dists, profile):\n",
    "    n = len(dists)\n",
    "    plt.figure(figsize=(8*n, 6))\n",
    "    ent = max(entropy(dist) for _, dist in dists)\n",
    "    for i, (name, dist) in enumerate(dists):\n",
    "        ax = plt.subplot(1, n, i+1)\n",
    "        profile(dist).draw(ax=ax)\n",
    "        if profile not in [EntropyTriangle, EntropyTriangle2]:\n",
    "            ax.set_ylim((-0.1, ent + 0.1))\n",
    "        ax.set_title(name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Complexity Profile"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "plot_profile(dists, ComplexityProfile)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once again, these two profiles are identical due to the I-Diagrams being identical. The complexity profile incorrectly suggests that there is no information at the scale of 3 variables."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Marginal Utility of Information"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "plot_profile(dists, MUIProfile)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The marginal utility of information is based on a linear programming problem with constraints related to values from the I-Diagram, and so here again the two distributions are undifferentiated."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Connected Informations"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "plot_profile(dists, SchneidmanProfile)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The connected informations are based on differences between maximum entropy distributions with differing $k$-way marginal distributions fixed. Here, the two distributions are differentiated."
   ]
  },
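  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a sketch of the definition, write $\\tilde{P}^{(k)}$ for the maximum entropy distribution consistent with all $k$-way marginals of the $n$-variable distribution $P$ (this notation is an assumption introduced here). The connected information at order $k$ is then:\n",
    "$$\n",
    " C^{(k)} = H[\\tilde{P}^{(k-1)}] - H[\\tilde{P}^{(k)}]\n",
    "$$\n",
    "\n",
    "Since $\\tilde{P}^{(1)}$ is the product of the single-variable marginals and $\\tilde{P}^{(n)} = P$, the sum over orders telescopes to the total correlation:\n",
    "$$\n",
    " \\sum_{k=2}^{n} C^{(k)} = H[\\tilde{P}^{(1)}] - H[\\tilde{P}^{(n)}] = \\sum H[X_i] - H[\\mathbf{X}] = T[\\mathbf{X}]\n",
    "$$\n",
    "\n",
    "So two distributions with the same total correlation can still differ in how that total is split across orders."
   ]
  },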
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "subslide"
    }
   },
   "source": [
    "### Multivariate Entropy Triangle"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "plot_profile(dists, EntropyTriangle)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Both distributions are at an identical location in the multivariate entropy triangle."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "slideshow": {
     "slide_type": "slide"
    }
   },
   "source": [
    "## Partial Information\n",
    "\n",
    "We next consider a variety of partial information decompositions."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true,
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "from dit.pid.helpers import compare_measures"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false,
    "slideshow": {
     "slide_type": "notes"
    }
   },
   "outputs": [],
   "source": [
    "for name, dist in dists:\n",
    "    compare_measures(dist, name=name)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we see that the PID determines that in the dyadic distribution two random variables uniquely contribute a bit of information to the third, whereas in the triadic distribution two random variables redundantly influence the third with one bit, and synergistically with another."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Multivariate Extensions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "from itertools import product"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "outcomes_a = [\n",
    "    (0,0,0,0),\n",
    "    (0,2,3,2),\n",
    "    (1,0,2,1),\n",
    "    (1,2,1,3),\n",
    "    (2,1,3,3),\n",
    "    (2,3,0,1),\n",
    "    (3,1,1,2),\n",
    "    (3,3,2,0),\n",
    "]\n",
    "outcomes_b = [\n",
    "    (0,0,0,0),\n",
    "    (0,0,1,1),\n",
    "    (0,1,0,1),\n",
    "    (0,1,1,0),\n",
    "    (1,0,0,1),\n",
    "    (1,0,1,0),\n",
    "    (1,1,0,0),\n",
    "    (1,1,1,1),\n",
    "]\n",
    "\n",
    "outcomes = [tuple(2*a + b for a, b in zip(a_, b_)) for a_, b_ in product(outcomes_a, outcomes_b)]\n",
    "quadradic = uniform(outcomes)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "dyadic2 = uniform([(4*a+2*c+e, 4*a+2*d+f, 4*b+2*c+f, 4*b+2*d+e) for a, b, c, d, e, f in product([0,1], repeat=6)])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": true
   },
   "outputs": [],
   "source": [
    "dists2 = [('dyadic2', dyadic2), ('quadradic', quadradic)]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "print_partition(dists2, ShannonPartition)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "print_partition(dists2, ExtropyPartition)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "print_table('Entropies', entropies, dists2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "print_table('Mutual Informations', mutual_informations, dists2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "print_table('Common Informations', common_informations, dists2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "print_table('Other Measures', other_measures, dists2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "plot_profile(dists2, ComplexityProfile)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "plot_profile(dists2, MUIProfile)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "plot_profile(dists2, SchneidmanProfile)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "collapsed": false
   },
   "outputs": [],
   "source": [
    "plot_profile(dists2, EntropyTriangle)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.6"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}